Scope

This document presents an integrated analysis of all cells that passed QC filtering. The results will determine whether multiple lineages can be observed in any of the conditions. Cells were previously assigned a cell type using SingleR (Aran et al. 2019) and the (Bone Marrow) Mouse Cell Atlas (Wang et al. 2022).

Sample integration is required because the two samples do not overlap perfectly.

Primary data

Two bone marrow samples were library-prepared and sequenced by the CRG Sequencing platform at PRBB. Libraries were prepared using the 10x Single Cell 3’ v3 kit. The corresponding biological samples were delivered to the sequencing platform in December’24 and the sequencing data were received in February’24. Each sample corresponds to a different condition:

  • CD45+ cells at 4.5 months (CD45_4_5)
  • CD45+ cells transplanted from KDR+ progenitors at 4.5 months (CD45_BFP_4_5), activated via 7* specific genes identified by CRISPRa in an earlier project phase.

*Genes related to HSC development fate.

Methods

The following criteria were applied:

  • SCTransform-based normalization (v2), regressing out mitochondrial content. All other parameters were left at their defaults.
  • Dimensionality reduction using 35 PCs, which together account for a cumulative explained variance of 85%.
  • Clustering (Louvain algorithm), exploring resolutions from 0.1 to 0.6 (from coarser to finer clusters).
  • ‘FindAllMarkers’ was used to identify differentially expressed genes for each cluster and condition.

All previous steps, including integration, were performed with the Seurat R package (v5.0.0) together with the glmGamPoi R package (Ahlmann-Eltze and Huber 2021) (v1.12.2). See the vignette: https://satijalab.org/seurat/archive/v4.3/sctransform_v2_vignette
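
The link between a number of PCs and a cumulative variance threshold can be illustrated with a small sketch. It is written in Python purely for illustration (the analysis itself was run in R with Seurat) and assumes that the explained variance is computed from the per-PC standard deviations, relative to the PCs that were calculated. The function name and the toy standard deviations are hypothetical and are not values from this dataset.

```python
from itertools import accumulate


def n_pcs_for_variance(stdevs, threshold=0.85):
    """Return the smallest number of leading PCs whose cumulative
    share of variance reaches `threshold` (a fraction between 0 and 1)."""
    variances = [s * s for s in stdevs]
    total = sum(variances)
    for n, cumulative in enumerate(accumulate(variances), start=1):
        if cumulative / total >= threshold:
            return n
    return len(stdevs)


# Hypothetical standard deviations for six PCs (not from this dataset).
toy_stdevs = [4.0, 3.0, 2.0, 1.0, 1.0, 1.0]
print(n_pcs_for_variance(toy_stdevs, threshold=0.85))
```

With real data, the same rule would be applied to the standard deviations stored with the PCA reduction.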

Results

As a reminder, the dataset to be analyzed contains 17396 features (genes) across 12530 cells from two samples, one per condition. Ribosomal genes were excluded from the dataset, and low-quality cells were discarded during a previous QC step.

The UMAP shows substantial overlap between samples after integration.

This representation confirms that some cell types predominate in the non-transplanted cells (Macrophages) and others in the transplanted cells (B cells and T cells).

It is recommended to visualize the UMAP against other possible confounding variables, e.g. mitochondrial content or the number of genes detected per cell. These are shown in the following figures:

Cell phase

Finally, the cell cycle phase can be inferred from a reference set of genes defined by Tirosh et al. (2016), and the phase labels can be represented on the UMAP, as shown in the following figure:

Since this dataset includes developing cells, regressing out cell cycle information is not recommended.
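
As a minimal sketch of how phase labels can be derived, the snippet below assumes that each cell already has an S score and a G2M score computed from the reference gene sets, and applies the decision rule that, to our understanding, Seurat's CellCycleScoring uses. It is written in Python for illustration only, and the function name is hypothetical.

```python
def assign_phase(s_score, g2m_score):
    """Assign a cell cycle phase from S and G2M module scores."""
    # Cells expressing neither programme are treated as G1.
    if s_score < 0 and g2m_score < 0:
        return "G1"
    # Exact ties cannot be resolved.
    if s_score == g2m_score:
        return "Undecided"
    # Otherwise the phase with the higher score wins.
    return "S" if s_score > g2m_score else "G2M"


print(assign_phase(-0.10, -0.20))
print(assign_phase(0.30, 0.10))
print(assign_phase(0.10, 0.40))
```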

Seurat Annotation

Different clustering resolutions are explored from lower (less specific cell type) to higher granularity (more specific cell type).

Clusters can be represented over the UMAP. The following graph shows this layout for clustering resolution 0.4.

A clustering resolution of 0.4 is chosen. Let’s visualize the clusters per condition:

The distribution of cells among clusters and between groups is:

##     
##      CD45_4_5 CD45_BFP_4_5
##   0      1370          466
##   1      1008          616
##   2       957          429
##   3       157         1142
##   4       706          430
##   5       657          125
##   6       519          211
##   7       501           92
##   8       378          200
##   9       426           87
##   10      305          141
##   11      181          156
##   12      121          204
##   13      117          183
##   14        9          272
##   15       34          241
##   16       64           25

Subclustering

Some cell types that are not present in the Mouse Cell Atlas can be observed using different hematopoietic markers. The manual re-annotation process is as follows:

  • Plot the markers of a specific cell type and look for positive cells in one Seurat cluster (res 0.4).
  • If the whole cluster is positive for the marker, it is re-annotated.
  • If the markers are only partially expressed, a subcluster is calculated to fit the positive cells.
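
The re-annotation decision was made visually from the marker plots. Purely as an illustration, it can be written as a rule on the fraction of marker-positive cells in a cluster. The sketch below is in Python; the function name and both thresholds are hypothetical and were not used in the analysis.

```python
def reannotation_action(n_positive, n_cells, full=0.9, partial=0.1):
    """Suggest an action for one cluster given its marker-positive cells.

    `full` and `partial` are hypothetical cut-offs, not values from the analysis.
    """
    if n_cells <= 0:
        raise ValueError("cluster must contain at least one cell")
    fraction = n_positive / n_cells
    if fraction >= full:
        return "re-annotate whole cluster"
    if fraction >= partial:
        return "subcluster"
    return "keep original annotation"


print(reannotation_action(95, 100))
print(reannotation_action(40, 100))
print(reannotation_action(2, 100))
```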

MAGIC imputation

MAGIC (Dijk et al. 2018) is an imputation algorithm that uses data diffusion across similar cells to recover expression lost to dropout and restore the structure of the data. We will use this tool to visualize the expression of different hematopoietic markers. This is an example of how the algorithm works:

The plot on the right shows how the imputation has revealed the expression of the gene “Gata3” in one cluster.
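
The idea behind diffusion-based imputation can be shown with a toy example. This is not the MAGIC implementation, which uses an adaptive kernel and further steps. It is a simplified Python sketch with a fixed Gaussian kernel, and the function name, coordinates and expression values are all hypothetical.

```python
import math


def diffuse(expr, coords, sigma=1.0, t=3):
    """Smooth one gene's expression over a cell-cell similarity graph."""
    n = len(coords)
    # Gaussian affinities from Euclidean distances between cells.
    affinity = [
        [math.exp(-((math.dist(coords[i], coords[j]) / sigma) ** 2))
         for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise the affinities into a Markov transition matrix.
    markov = [[a / sum(row) for a in row] for row in affinity]
    # Applying the matrix t times is equivalent to multiplying by markov**t.
    smoothed = list(expr)
    for _ in range(t):
        smoothed = [
            sum(markov[i][j] * smoothed[j] for j in range(n))
            for i in range(n)
        ]
    return smoothed


# Two groups of cells; cell 1 is a dropout inside the expressing group.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
expr = [2.0, 0.0, 3.0, 0.0, 0.0]
print([round(value, 2) for value in diffuse(expr, coords)])
```

In this toy case the dropout cell borrows signal from its close neighbours, while the distant group stays close to zero.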

Hematopoietic markers

Marker signatures are taken from Kucinski et al. (2024): https://doi.org/10.1016/j.stem.2023.12.001

HSC_markers

This signature is expressed in cluster 11, and therefore it has been tagged as “Hematopoietic stem cell”.

Neutrophil_Progenitor_markers

This signature is expressed in cluster 9, and therefore it has been tagged as “Neutrophil progenitor”.

Basophil_Progenitor_markers

This signature is expressed partially in cluster 11. After subclustering, cluster 11_6 fits the positive cells, and therefore it has been tagged as “Basophil progenitor”.

ILC_markers

This signature is consistently expressed in cluster 14, and therefore it has been tagged as “Innate lymphoid cell”.

Curated.cell.ident

After re-annotation, cell types are represented as follows:

General Cell Type Annotation

These 20 cell types were summarized into 9 main cell types (B cell, Basophil, Erythroid, HSC/HSPC, ILC, Macrophage, Myeloid cells, Neutrophil and T cells).
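
Summarizing fine labels into main cell types amounts to a lookup table. The Python sketch below is for illustration only. It lists just the four labels named in this report, and the main type assigned to each is an assumption rather than the mapping actually used. The names MAIN_TYPE and to_main_type are hypothetical.

```python
from collections import Counter

# Assumed mapping for the four labels named in this report (partial).
MAIN_TYPE = {
    "Hematopoietic stem cell": "HSC/HSPC",
    "Neutrophil progenitor": "Neutrophil",
    "Basophil progenitor": "Basophil",
    "Innate lymphoid cell": "ILC",
}


def to_main_type(fine_label):
    """Map a fine label to its main cell type.

    Labels without an entry are returned unchanged so that gaps
    in the lookup table stay visible.
    """
    return MAIN_TYPE.get(fine_label, fine_label)


# Hypothetical per-cell labels, not taken from the dataset.
toy_labels = [
    "Hematopoietic stem cell",
    "Basophil progenitor",
    "Innate lymphoid cell",
    "Innate lymphoid cell",
]
main_counts = Counter(to_main_type(label) for label in toy_labels)
print(main_counts)
```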

Seurat Re-Annotation

Finally, clusters 11, 12, 13 and 15 were subclustered to fit the final cell types.

References

Ahlmann-Eltze, Constantin, and Wolfgang Huber. 2021. “glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data.” Bioinformatics 36 (24): 5701–2. https://doi.org/10.1093/bioinformatics/btaa1009.
Aran, Dvir, Agnieszka P Looney, Leqian Liu, Esther Wu, Valerie Fong, Austin Hsu, Suzanna Chak, et al. 2019. “Reference-Based Analysis of Lung Single-Cell Sequencing Reveals a Transitional Profibrotic Macrophage.” Nature Immunology 20 (2): 163–72.
Dijk, David van, Roshan Sharma, Juozas Nainys, Kristina Yim, Pooja Kathail, Ambrose J Carr, Cassandra Burdziak, et al. 2018. “Recovering Gene Interactions from Single-Cell Data Using Data Diffusion.” Cell 174 (3): 716–729.e27. https://doi.org/10.1016/j.cell.2018.05.061.
Tirosh, Itay, Benjamin Izar, Sanjay M Prakadan, Marc H 2nd Wadsworth, Daniel Treacy, John J Trombetta, Asaf Rotem, et al. 2016. “Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq.” Science (New York, N.Y.) 352 (6282): 189–96. https://doi.org/10.1126/science.aad0501.
Wang, Renying, Peijing Zhang, Jingjing Wang, Lifeng Ma, Weigao E, Shengbao Suo, Mengmeng Jiang, et al. 2022. “Construction of a cross-species cell landscape at single-cell level.” Nucleic Acids Research 51 (2): 501–16. https://doi.org/10.1093/nar/gkac633.